1  Introduction

One of the primary tasks of a computer vision system is to reconstruct, from
two-dimensional images, certain three-dimensional properties of a scene such as motion,
shape, and the spatial arrangement of objects in the scene. In monocular vision, the goal
is to recover, from time-varying images, the relative motion between a viewer and the
environment as well as the structure of the environment. The structure of the scene is
usually defined in terms of the relative depth of points on the visible part of the surface
of the scene.

An important issue in motion vision is whether the solution can be determined
uniquely. Alternatively, one may ask if there exist situations that give rise to an
ambiguity in the interpretation of three-dimensional motion and shape.

Apparently, Hay [1966] was the first to report the observation that two planar surfaces,
each undergoing a different rigid motion, can produce the same instantaneous motion in
the image plane. The same observation, and the proof that at most two solutions exist
in the case of planar surfaces, have since been reported by Tsai et al. [1982],
Waxman & Ullman [1985], Longuet-Higgins [1984], Maybank [1984], and Negahdaripour
& Horn [1985].

In  the  case  of  curved  surfaces,  two  types  of  approaches,  based  on  local  and  global 
representation  of  three-dimensional  surfaces,  have  been  pursued.  In  the  local  approach, 
the  surface  is  represented  by  its  Taylor  series  expansion  in  some  neighborhood  of  the 
fixation  point,  based  on  the  assumption  that  the  surface  has  continuous  derivatives  up 
to  some  order  n  in  that  region.  In  the  global  approach,  no  special  model  is  assumed  and 
the  depth  values  are  allowed  to  vary  arbitrarily  from  one  point  to  the  next. 

Using a local second-order analysis, Longuet-Higgins & Prazdny [1980] show that
three interpretations are possible for the motion parameters and the local structure of
the surface of the scene (tangent plane orientation and surface curvature). Waxman et
al. [1986] derive the special cases which give rise to the three-fold ambiguity observed
by Longuet-Higgins & Prazdny. In addition, they show that other situations can give
rise to a two-fold ambiguity (similar results were derived by Negahdaripour & Yuille
[1986]). Negahdaripour [1986] shows that only certain hyperboloids of one sheet and
circular cylinders can give rise to an image motion field with multiple interpretations.
The ambiguity of hyperboloids of one sheet can be either two-fold or three-fold. The
ambiguity associated with circular cylinders is two-fold and can be viewed as a degenerate
case of the three-fold ambiguity of hyperboloids of one sheet. Negahdaripour [1986] also
shows that most of the ambiguities observed by Waxman et al. stem from the shortcomings
of a local second-order analysis of the motion field.

The same problem has been addressed using a global analysis. Fang & Huang [1984]
use the correspondence of nine image points in two frames to show that the motion
parameters can be determined uniquely unless the points lie on a second-order surface
passing through the viewing point. Tsai & Huang [1984] show that the correspondence of
seven image points is sufficient to recover motion uniquely unless the points lie on either
two planes with one passing through the viewing point or a cone that passes through
the viewing point. In an elegant derivation, Horn [1986b] shows that the class of curved
surfaces that may give rise to the same motion field is restricted to certain hyperboloids
of one sheet that are viewed from a point on their surface. He also shows that the special
cases observed by Tsai & Huang are degenerate cases of his results.

In  this  paper,  we  present  some  results  related  to  the  ambiguities  in  the  interpretation 
of  three-dimensional  motion  of  curved  surfaces.  These  results  can  be  summarized  as 
follows: 

(1)  Only  certain  hyperboloids  of  one  sheet  or  cylindrical  surfaces  that  are  viewed  by  an 
observer  moving  parallel  to  the  image  plane  can  give  rise  to  an  ambiguity  in  the 
interpretation  of  a  perspective  motion  field.  These  ambiguities  occur  under  two  rare 
conditions  and  can  be  either  two-fold  or  three-fold.  In  either  case,  the  resulting  motion 
field  is  quadratic. 

(2)  When multiple solutions exist, the relationship among them can be derived in closed
form.

(3)  With  a  large  field  of  view,  it  is  generally  possible  to  identify  the  correct  solution  by 
imposing  the  constraint  that  depth  is  positive  over  the  image  region  onto  which  the 
surface  projects. 


2  Motion  Field 


We assume a viewer-centered coordinate system. The optical axis is chosen along the
z-axis, and the image is formed on the plane $z = 1$; that is, without loss of generality,
the focal length is chosen to be unity (Figure 1). Let $\mathbf{R} = [X, Y, Z]^T$ be a point
in the scene that projects onto the point $\mathbf{r} = [x, y, 1]^T$ in the image. Assuming a
perspective projection, we have

\[ \mathbf{r} = \frac{1}{\mathbf{R} \cdot \hat{\mathbf{z}}} \, \mathbf{R}, \]

where $Z = \mathbf{R} \cdot \hat{\mathbf{z}}$ is the distance of the point $\mathbf{R}$ from the viewer along the optical axis.
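Suppose the viewer moves with relative translational and rotational velocities $\mathbf{t}$ and
$\boldsymbol{\omega}$ with respect to the scene. Then a point in the scene moves with respect to the
viewer with velocity

\[ \mathbf{R}_t = -\mathbf{t} - \boldsymbol{\omega} \times \mathbf{R}. \]

Figure 1. Viewer-centered coordinate system.

The corresponding velocity of the image point follows by differentiating the projection
equation with respect to time:

\[ \mathbf{r}_t = -\hat{\mathbf{z}} \times \left( \mathbf{r} \times \left( \mathbf{r} \times \boldsymbol{\omega} - \frac{1}{\mathbf{R} \cdot \hat{\mathbf{z}}} \, \mathbf{t} \right) \right). \]

Note that $\mathbf{r}_t \cdot \hat{\mathbf{z}} = 0$ and, therefore, the third component of the 3D vector $\mathbf{r}_t$ is always
zero, as expected. The velocities of all image points, taken collectively, define a
two-dimensional vector field that we call the image motion field (Negahdaripour & Horn
[1987]). This vector field has been referred to elsewhere as the optical flow field (see
Horn [1986a] or Negahdaripour [1986] for the distinction between the motion field and
optical flow).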

The  motion  field  remains  unchanged  if  the  translational  vector  t  and  the  depth  values 
Z  are  multiplied  by  the  same  constant  factor.  Therefore,  we  can  recover  depth  and  the 
translational  motion  from  image  motion  field  only  up  to  a  scale  factor;  this  has  been 
referred  to  as  the  scale-factor  ambiguity  in  motion  vision. 
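
The scale-factor ambiguity can be checked numerically. The sketch below is only illustrative: it assumes the standard rigid-motion model (scene velocity $-\mathbf{t} - \boldsymbol{\omega} \times \mathbf{R}$, unit focal length), written out in components, and the function name `motion_field` and the sample values are hypothetical.

```python
def motion_field(x, y, d, t, w):
    """Image velocity (u, v) at image point (x, y, 1).

    d is the inverse depth 1/Z, t the translation, w the rotation.
    Assumed model: scene velocity -t - w x R, focal length 1.
    """
    tx, ty, tz = t
    wx, wy, wz = w
    u = -wy + wz * y + wx * x * y - wy * x * x + d * (-tx + x * tz)
    v = wx - wz * x - wy * x * y + wx * y * y + d * (-ty + y * tz)
    return u, v


# Illustrative (made-up) motion, depth, and image point.
t = (1.0, 2.0, 0.5)
w = (0.1, -0.2, 0.05)
Z = 4.0
x, y = 0.3, -0.2

# Scale both the translation and the depth by the same factor.
scale = 3.0
t_scaled = tuple(scale * c for c in t)
Z_scaled = scale * Z

field_original = motion_field(x, y, 1.0 / Z, t, w)
field_scaled = motion_field(x, y, 1.0 / Z_scaled, t_scaled, w)
```

Because the translation enters only through the product of $\mathbf{t}$ and $1/Z$, the two fields coincide, while the rotation is unaffected by the scaling.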

3  The  Uniqueness  Issue 

The  motion  field  is  a  purely  geometric  concept.  It  is  uniquely  defined  in  terms  of  the 
observer  motion  and  the  scene  structure.  More  precisely,  once  we  specify  the  motion  of 
the  viewer  as  well  as  the  structure  of  the  scene,  the  motion  field  is  unique  as  given  by  the 
earlier  equation;  that  is, 

        3D motion and structure  →  2D image motion

Ideally, the motion field over some region of the image can be used to recover the
relative motion, $\boldsymbol{\omega}$ and $\mathbf{t}$, as well as the structure of the scene (defined in terms of
the depth values of the points on the surface of the scene, $Z$) up to a scale factor. The
issue we address is the uniqueness of the solution:

        2D image motion  →  3D motion and structure   (unique?)


More  precisely,  the  question  we  answer  is:  Given  the  motion  field  over  some  region  of 
the  image,  when  can  we  recover  the  motion  parameters  and  the  scene  structure  uniquely 
(up  to  the  scale-factor  ambiguity)?  Note  that  the  solution  may  be  non-unique  because 
the  relationship  between  the  two-dimensional  image  motion  field  and  the  underlying 
three-dimensional  structure  and  motion  is  non-linear  and,  therefore,  there  need  not  be  a 
one-to-one  correspondence.  Alternatively,  we  are  interested  to  know  what  circumstances 
may  give  rise  to  an  ambiguity  in  the  interpretation  of  the  motion  field.  In  general,  the 
problem  of  robust  recovery  of  3D  structure  and  motion  can  be  more  complicated  (than 
deriving  the  uniqueness  results)  since  we  cannot  compute  the  motion  field  accurately 
enough  from  time-varying  images.  The  inaccuracy  in  the  estimate  of  the  motion  field 
obtained  from  noisy  data  may  then  introduce  additional  ambiguities  in  the  interpretation 
of three-dimensional motion and shape (Ullman [1983], Jerian & Jain [1983], Adiv [1985]).
In  this  paper,  we  restrict  attention  to  the  types  of  ambiguities  that  arise  even  when  the 
exact  motion  field  is  used. 


4  Surface  Representation 

One of the issues in deriving uniqueness results for curved surfaces is the choice of an
appropriate representation of general three-dimensional surfaces. For example, the surfaces
of a scene may be represented by the relationship $Z = Z(X, Y)$, where $\mathbf{R} = [X, Y, Z]^T$ is a
point on the surface of the scene. This will be referred to as a global representation since
we can define the whole surface in this form. There is no constraint on the relationship
among the depth values of neighboring points and the surface need not have a particular
structure in local regions; for example, it need not be smooth. The surfaces of most
physical objects, however, possess some degree of "regularity" at least in local regions;
the regularity or smoothness of a surface can be measured in terms of the continuity in
the surface function and its derivatives. In other words, most surfaces have continuous
derivatives up to some order $n$ in local regions. Then we can represent a surface by its
Taylor series expansion up to order $n$ in a local region, assumed, for simplicity, to be
around the fixation point:

\[ Z = Z_0 + Z_X X + Z_Y Y + \tfrac{1}{2} Z_{XX} X^2 + Z_{XY} XY + \tfrac{1}{2} Z_{YY} Y^2 + \ldots
     + \tfrac{1}{n!} Z_{X \cdots X} X^n + \ldots + \tfrac{1}{n!} Z_{Y \cdots Y} Y^n + O(\epsilon). \]

Alternatively, the surface may be represented by a Taylor series expansion in terms of
the image coordinates. This is a more convenient representation when we deal with
images. Finally, we may write the expansion of $d = 1/Z$ (which is a measure related to
the disparity function in stereopsis), instead of the expansion of $Z$, since the motion field
equation depends on $1/Z$ (as a result of perspective projection). Therefore, the surface
can be represented by the equation

\[ d = d_0 + d_x x + d_y y + \tfrac{1}{2} d_{xx} x^2 + d_{xy} xy + \tfrac{1}{2} d_{yy} y^2 + \ldots
     + \tfrac{1}{n!} d_{x \cdots x} x^n + \ldots + \tfrac{1}{n!} d_{y \cdots y} y^n + O(\epsilon). \]

Ignoring the error term, this can be written more compactly using tensor notation,

\[ d = d_0 + d_i r_i + \tfrac{1}{2} d_{ij} r_i r_j + \ldots + \tfrac{1}{n!} d_{ij \cdots} r_i r_j \cdots, \]

where $r_i = [x, y]^T$. Since depth can be recovered only up to a scale factor, we can set
$d_0 = 1$. Therefore, we can write

\[ d = 1 + d_i r_i + \tfrac{1}{2} d_{ij} r_i r_j + \ldots + \tfrac{1}{n!} d_{ij \cdots} r_i r_j \cdots. \]

This  will  be  referred  to  as  a  local  representation  since  it  may  be  the  true  representation 
of  the  surface  only  in  a  local  region  near  the  fixation  point.  One  justification  for  using 
a  local  representation  is  that  we  only  need  a  finite  number  of  parameters,  namely,  the 
coefficients of the Taylor series ($d_i$, $d_{ij}$, ...) to represent the surface, where the number
of  parameters  is  related  to  the  order  of  the  Taylor  series.  We  may  not  need  to  impose 
any  restriction  on  n  in  addressing  the  uniqueness  issue  since  we  deal  solely  with  a  purely 
theoretical  problem.  The  problem  of  robust  recovery  of  motion  and  shape,  however,  is  a 
separate  issue.  In  practice,  we  cannot  robustly  determine  the  coefficients  of  the  surface 
function beyond the linear terms due to noise in the data (Adiv [1985], Le Guilloux
[1986]). Furthermore, the resulting non-linear problem is usually ill-conditioned.

Using  a  local  representation  of  curved  surfaces,  the  motion  vision  problem  reduces 
to  estimating  a  finite  number  of  unknowns,  namely,  the  motion  and  surface  parameters, 
from  the  image  data.  Similarly,  the  uniqueness  issue  translates  into  the  following  question: 
How  many  sets  of  motion/surface  parameters  can  give  rise  to  the  same  motion  field  over 
the  image  region  of  interest? 


5  Ambiguities  in  the  Interpretation  of  the  Motion  Field 

If we substitute the equation for $d = 1/Z$ into the motion field equation, we arrive at the
Taylor series representation of the motion field, $\mathbf{r}_t = [u, v, 0]^T$, in the image region under
consideration; that is,

\[ u = u_0 + u_x x + u_y y + \tfrac{1}{2} u_{xx} x^2 + u_{xy} xy + \tfrac{1}{2} u_{yy} y^2
     + \tfrac{1}{6} u_{xxx} x^3 + \tfrac{1}{2} u_{xxy} x^2 y + \tfrac{1}{2} u_{xyy} x y^2 + \tfrac{1}{6} u_{yyy} y^3 + \ldots \]

\[ v = v_0 + v_x x + v_y y + \tfrac{1}{2} v_{xx} x^2 + v_{xy} xy + \tfrac{1}{2} v_{yy} y^2
     + \tfrac{1}{6} v_{xxx} x^3 + \tfrac{1}{2} v_{xxy} x^2 y + \tfrac{1}{2} v_{xyy} x y^2 + \tfrac{1}{6} v_{yyy} y^3 + \ldots \]

where  the  coefficients  of  the  Taylor  series  expansion  of  the  motion  field  are  given  by 


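\[
\begin{aligned}
u_0 &= -\omega_y - t_x, & v_0 &= \omega_x - t_y, \\
u_x &= t_z - t_x d_x, & v_x &= -\omega_z - t_y d_x, \\
u_y &= \omega_z - t_x d_y, & v_y &= t_z - t_y d_y, \\
u_{xx} &= -2\omega_y + 2 t_z d_x - t_x d_{xx}, & v_{xx} &= -t_y d_{xx}, \\
u_{xy} &= \omega_x + t_z d_y - t_x d_{xy}, & v_{xy} &= -\omega_y + t_z d_x - t_y d_{xy}, \\
u_{yy} &= -t_x d_{yy}, & v_{yy} &= 2\omega_x + 2 t_z d_y - t_y d_{yy}, \\
u_{xxx} &= 3 t_z d_{xx} - t_x d_{xxx}, & v_{xxx} &= -t_y d_{xxx}, \\
u_{xxy} &= 2 t_z d_{xy} - t_x d_{xxy}, & v_{xxy} &= t_z d_{xx} - t_y d_{xxy}, \\
u_{xyy} &= t_z d_{yy} - t_x d_{xyy}, & v_{xyy} &= 2 t_z d_{xy} - t_y d_{xyy}, \\
u_{yyy} &= -t_x d_{yyy}, & v_{yyy} &= 3 t_z d_{yy} - t_y d_{yyy}.
\end{aligned}
\]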

The  question  regarding  the  uniqueness  of  the  solution  can  be  rephrased  as  follows: 
Under  what  circumstances  can  we  obtain  the  same  set  of  motion  field  coefficients  up  to 
some order $n$ for different sets of motion/surface parameters?
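
One way to gain confidence in the coefficient expressions is to compare the truncated series with the motion field evaluated directly. The sketch below does this for a quadratic inverse-depth surface, for which the series terminates at third order. It assumes the rigid-motion model with scene velocity $-\mathbf{t} - \boldsymbol{\omega} \times \mathbf{R}$; the function names and parameter values are hypothetical, and the $v_{xxy}$ entry is taken to include the term $t_z d_{xx}$, which the expansion requires by symmetry with $u_{xyy}$.

```python
def exact_field(x, y, t, w, s):
    """Direct evaluation of (u, v); s = (dx, dy, dxx, dxy, dyy), d0 = 1."""
    (tx, ty, tz), (wx, wy, wz), (dx, dy, dxx, dxy, dyy) = t, w, s
    d = 1 + dx * x + dy * y + 0.5 * dxx * x * x + dxy * x * y + 0.5 * dyy * y * y
    u = -wy + wz * y + wx * x * y - wy * x * x + d * (-tx + x * tz)
    v = wx - wz * x - wy * x * y + wx * y * y + d * (-ty + y * tz)
    return u, v


def series_field(x, y, t, w, s):
    """Same field from the Taylor coefficients (third surface derivatives are 0)."""
    (tx, ty, tz), (wx, wy, wz), (dx, dy, dxx, dxy, dyy) = t, w, s
    u0, v0 = -wy - tx, wx - ty
    ux, vx = tz - tx * dx, -wz - ty * dx
    uy, vy = wz - tx * dy, tz - ty * dy
    uxx, vxx = -2 * wy + 2 * tz * dx - tx * dxx, -ty * dxx
    uxy, vxy = wx + tz * dy - tx * dxy, -wy + tz * dx - ty * dxy
    uyy, vyy = -tx * dyy, 2 * wx + 2 * tz * dy - ty * dyy
    uxxx, vxxx = 3 * tz * dxx, 0.0
    uxxy, vxxy = 2 * tz * dxy, tz * dxx
    uxyy, vxyy = tz * dyy, 2 * tz * dxy
    uyyy, vyyy = 0.0, 3 * tz * dyy
    u = (u0 + ux * x + uy * y + 0.5 * uxx * x * x + uxy * x * y + 0.5 * uyy * y * y
         + uxxx * x ** 3 / 6 + 0.5 * uxxy * x * x * y
         + 0.5 * uxyy * x * y * y + uyyy * y ** 3 / 6)
    v = (v0 + vx * x + vy * y + 0.5 * vxx * x * x + vxy * x * y + 0.5 * vyy * y * y
         + vxxx * x ** 3 / 6 + 0.5 * vxxy * x * x * y
         + 0.5 * vxyy * x * y * y + vyyy * y ** 3 / 6)
    return u, v


# Made-up motion and surface parameters, used only for the comparison.
t = (0.7, -1.2, 0.4)
w = (0.05, -0.1, 0.2)
s = (0.3, -0.2, 1.5, -0.8, 0.5)
points = [(0.0, 0.0), (0.4, -0.3), (-0.5, 0.5), (0.25, 0.1)]
gap = max(abs(a - b)
          for (x, y) in points
          for a, b in zip(exact_field(x, y, t, w, s), series_field(x, y, t, w, s)))
```

The largest discrepancy `gap` is at the level of rounding error, which is what one expects if every coefficient in the list is consistent with the motion field equation.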

Considering  the  motion  field  coefficients  up  to  the  second-order  terms,  Longuet- 
Higgins  &  Prazdny  [1980]  showed  that  three  interpretations  for  the  motion  and  local 
structure  of  the  surface  (tangent  plane  orientation  and  surface  curvature)  are  generally 
possible.  They  arrived  at  this  conclusion  by  reducing  the  problem  of  determining  the 
motion  and  surface  parameters  to  solving  a  cubic  equation;  however,  they  did  not  show 
when  the  cubic  equation  can  possess  three  real  solutions.  Waxman  et  al.  [1986]  derived 
the  conditions  that  give  rise  to  the  ambiguity  observed  by  Longuet-Higgins  &  Prazdny 
(similar  results  were  derived  independently  by  Negahdaripour  &  Yuille  [1986]).  These 
conditions  can  be  categorized  as  follows: 

(1)  Three solutions are obtained when $d_x = d_y = 0$, the surface has a negative Gaussian
curvature ($d_{xx} d_{yy} - d_{xy}^2 < 0$), and the mean-scaled curvature is unity ($\tfrac{1}{2}(d_{xx} + d_{yy}) = 1$).

(2)  Two solutions are obtained if the surface has a non-positive Gaussian curvature
($d_{xx} d_{yy} - d_{xy}^2 \le 0$), the mean-scaled curvature is unity ($\tfrac{1}{2}(d_{xx} + d_{yy}) = 1$), and the
translation along the line of sight vanishes ($t_z = 0$). When $d_x/d_y = t_x/t_y$, the two
solutions become equivalent.

(3)  Two solutions are obtained regardless of the sign of the Gaussian curvature when
$d_x/d_y = t_x/t_y$ (this condition was referred to as the structure-motion coincidence).
The  two  solutions  degenerate  to  a  unique  solution  when  the  translation  is  along  the 
surface  normal  or  when  tz  =  0. 

(4)  In  all  other  cases,  the  solution  is  unique. 

Longuet-Higgins  &  Prazdny  claim  that  these  ambiguities  are  usually  only  instantaneous, 
and  can  be  resolved  at  the  next  time  instant.  Negahdaripour  [1986]  shows  that  an 
ambiguity  in  the  interpretation  of  the  motion  field  arises  only  in  the  case  of  some  quadratic 
surfaces  with  a  non-positive  Gaussian  curvature  at  the  fixation  point.  More  importantly, 
these  surfaces  have  to  be  viewed  by  an  observer  translating  perpendicular  to  the  viewing 
direction.  Interestingly,  the  restriction  on  the  motion  of  the  observer  is  peculiar  to  the 
local  representation  of  curved  surfaces.  Furthermore,  the  resulting  motion  field  is  second- 
order;  but  this  obviously  does  not  imply  that  any  second-order  motion  field  is  ambiguous. 
The  ambiguity  in  the  case  of  quadratic  surfaces  that  are  viewed  by  an  observer  moving 
parallel  to  the  image  plane  (tz= 0)  is  either 

(1)  two/three-fold when the surface gradient vanishes ($d_x = d_y = 0$), the Gaussian
curvature is zero/negative ($d_{xx} d_{yy} - d_{xy}^2 \le 0$), and the mean-scaled curvature is unity
($\tfrac{1}{2}(d_{xx} + d_{yy}) = 1$), or

(2)  two-fold when the Gaussian curvature is negative ($d_{xx} d_{yy} - d_{xy}^2 < 0$), the mean-scaled
curvature is unity ($\tfrac{1}{2}(d_{xx} + d_{yy}) = 1$), and the surface normal, the optical axis, and
one of the asymptotic lines of the quadratic surface are in the same plane; that is,

\[ \text{either} \quad \frac{d_x}{d_y} = \frac{-d_{xy} + \sqrt{d_{xy}^2 - d_{xx} d_{yy}}}{d_{xx}}
   \quad \text{or} \quad \frac{d_x}{d_y} = \frac{-d_{xy} - \sqrt{d_{xy}^2 - d_{xx} d_{yy}}}{d_{xx}}. \]

When $d_x/d_y = t_x/t_y$, the ambiguity is resolved because the two solutions become
identical.


6  Surfaces  That  Give  Rise  to  an  Ambiguity 

We  have  given  the  conditions  under  which  there  may  be  an  ambiguity  in  the  interpretation 
of  a  given  motion  field  resulting  from  the  relative  motion  between  an  observer  and  a  curved 
surface.  Since  the  ambiguity  is  restricted  to  quadratic  surfaces,  we  can  ignore  the  higher 
order terms in the surface function given earlier. Therefore, an "ambiguous surface" is
given by

\[ d = 1 + d_i r_i + \tfrac{1}{2} d_{ij} r_i r_j, \]

which can be written

\[ d = \tfrac{1}{2} \mathbf{r}^T D \mathbf{r}, \]

where

\[ D = \begin{pmatrix} d_{xx} & d_{xy} & d_x \\ d_{xy} & d_{yy} & d_y \\ d_x & d_y & 2 \end{pmatrix}. \]

If we multiply both sides of the equation for $d = 1/Z$ by $Z^2$, we obtain

\[ Z = \tfrac{1}{2} \mathbf{R}^T D \mathbf{R} \]

or

\[ -\hat{\mathbf{z}}^T \mathbf{R} + \tfrac{1}{2} \mathbf{R}^T D \mathbf{R} = 0, \]

which is the equation of a quadratic surface passing through the origin (viewing point).
This, however, is an artifact resulting from the multiplication of both sides of the equation
for $d$ by $Z^2$; that is, by multiplying both sides of the equation for $d$ by $Z^2$, we have
artificially made the origin ($\mathbf{R} = 0$) a point on the surface. To be precise, we should
define the surface by

\[ -\hat{\mathbf{z}}^T \mathbf{R} + \tfrac{1}{2} \mathbf{R}^T D \mathbf{R} = 0, \qquad \mathbf{R} \ne 0. \]


This brings up an interesting issue since some previous results derived using a global
representation of curved surfaces suggest that an "ambiguous surface" should pass through
the viewing point (Fang & Huang [1984], Tsai & Huang [1984], Horn [1986b]). The
inconsistency in the interpretation of the results can be attributed to the difference
between the local and global representations of the surface and can be explained through
the following example: Consider an observer on the inside surface of a cylinder viewing a
point on the inside surface directly across from him/her (as before, a viewer-centered
coordinate system is assumed). In the global representation, we consider the whole surface,
whereas in the local representation, we only consider some neighborhood of the fixation
point. The viewing point is a point on the surface in the global representation. It will
not  be  so  in  the  local  representation.  The  difficulty,  however,  arises  when  we  consider 
the  behavior  of  the  motion  field  at  the  origin  of  the  image  plane.  The  question  is:  Which 
point  projects  onto  the  origin  of  the  image  plane,  the  viewing  point  or  the  point  directly 
across?  Consequently,  is  it  the  motion  of  the  viewing  point  or  the  point  across  that 
we  observe  (or  measure)  in  the  image?  Clearly,  it  is  the  latter.  Whether  we  do  or  do 
not  include  the  origin  as  a  point  on  the  surface  seems  to  depend  on  the  choice  of  the 
representation  we  use. 


The  signs  of  the  eigenvalues  of  the  matrix  D  determine  the  type  of  the  quadratic 
surface.  When  an  ambiguity  exists,  the  eigenvalues  of  D,  in  the  ascending  order,  are 
given  by  (see  the  Appendix) 


\[ \lambda_- = 1 - \sqrt{1 + p + q}, \quad \lambda_0 = 2, \quad \text{and} \quad \lambda_+ = 1 + \sqrt{1 + p + q}, \]

where

\[ p = d_x^2 + d_y^2 \quad \text{and} \quad q = d_{xy}^2 - d_{xx} d_{yy}. \]

In the case of a three-fold ambiguity, we have $p = 0$ and $q > 0$. The eigenvalues of $D$
become

\[ \lambda_- = 1 - \sqrt{1 + q} < 0, \quad \lambda_0 = 2, \quad \text{and} \quad \lambda_+ = 1 + \sqrt{1 + q} > 2. \]

Since the ambiguous surface has one negative and two positive eigenvalues, it has to be
a hyperboloid of one sheet. When $q = 0$, that is, the Gaussian curvature vanishes, the
ambiguity degenerates to a two-fold one. In this case, we have

\[ \lambda_- = 0, \quad \lambda_0 = 2, \quad \text{and} \quad \lambda_+ = 2, \]

and the hyperboloid of one sheet degenerates to a circular cylinder. The other ambiguous
situation arises when the optical axis, the surface normal, and an asymptotic line are in
the same plane. In this case, $p > 0$ and $q > 0$, and we obtain

\[ \lambda_- = 1 - \sqrt{1 + p + q} < 0, \quad \lambda_0 = 2, \quad \text{and} \quad \lambda_+ = 1 + \sqrt{1 + p + q} > 2. \]

Again, the surface is a hyperboloid of one sheet. Note that, in every case, we have
$\lambda_- + \lambda_+ = \lambda_0$ and, therefore, there is a constraint among the eigenvalues. We conclude
that  not  every  hyperboloid  of  one  sheet  or  circular  cylinder  can  give  rise  to  an  ambiguity. 
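
The closed-form eigenvalues can be checked without an eigenvalue solver by substituting them into the characteristic polynomial of $D$. The sketch below assumes that $D$ is the symmetric matrix with rows $(d_{xx}, d_{xy}, d_x)$, $(d_{xy}, d_{yy}, d_y)$, $(d_x, d_y, 2)$, so that $d = \tfrac{1}{2} \mathbf{r}^T D \mathbf{r}$ with $\mathbf{r} = [x, y, 1]^T$; the helper names and the two sample surfaces are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def surface_matrix(dx, dy, dxx, dxy, dyy):
    """Assumed form of D, chosen so that d = r^T D r / 2 with r = (x, y, 1)."""
    return [[dxx, dxy, dx], [dxy, dyy, dy], [dx, dy, 2.0]]


def char_poly(D, lam):
    """det(D - lam * I); zero exactly when lam is an eigenvalue of D."""
    shifted = [[D[i][j] - (lam if i == j else 0.0) for j in range(3)]
               for i in range(3)]
    return det3(shifted)


def predicted_eigenvalues(dx, dy, dxx, dxy, dyy):
    p = dx * dx + dy * dy
    q = dxy * dxy - dxx * dyy
    root = math.sqrt(1 + p + q)
    return 1 - root, 2.0, 1 + root


# Case one: zero gradient, unit mean-scaled curvature, negative Gaussian curvature.
case_one = (0.0, 0.0, 1.0, 10.0, 1.0)

# Case two: same curvatures, gradient along an asymptotic direction (dx/dy = alpha).
alpha = (-10.0 + math.sqrt(10.0 ** 2 - 1.0 * 1.0)) / 1.0
case_two = (alpha * 0.5, 0.5, 1.0, 10.0, 1.0)

residuals = [char_poly(surface_matrix(*case), lam)
             for case in (case_one, case_two)
             for lam in predicted_eigenvalues(*case)]
```

All six residuals vanish to rounding error. For the first surface the three values are -9, 2, and 11, so the constraint that the two outer eigenvalues sum to the middle one is easy to see by inspection.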

7  Relationship  Among  Multiple  Solutions 

The  conditions  given  above  allow  us  to  determine  when  we  may  expect  an  ambiguity  in 
the  interpretation  of  a  motion  field  and  the  number  of  possible  interpretations.  Generally, 
we  do  not  know  the  correct  interpretation  in  advance.  Therefore,  we  need  to  obtain  all 
of  the  possible  solutions  before  we  can  identify  the  correct  one;  it  is  often  difficult,  if  not 
impossible,  to  obtain  a  robust  closed-form  solution  in  the  case  of  curved  surfaces.  We  may 
need  to  rely  on  some  type  of  iterative  method  to  recover  motion  from  a  set  of  non-linear 
equations.  An  iterative  method,  however,  can  converge  to  only  one  of  up  to  three  possible 
solutions  (when  multiple  solutions  exist)  depending  on  the  initial  condition.  Several  runs 
of  the  iterative  algorithm  may  be  necessary  before  we  obtain  every  possible  solution;  this 
is  expensive  computationally.  Furthermore,  in  some  cases,  a  solution  may  be  hard  to 
obtain  if  it  has  a  small  radius  of  convergence  (this  is  important  if  it  is  the  true  solution). 
Finally,  the  true  solution  may  not  be  an  optimum  one  (in  the  least-squares  sense)  with 
noisy  data.  Therefore,  it  is  not  only  important  to  know  what  circumstances  can  give  rise 
to  an  ambiguity  and  the  number  of  possible  solutions,  it  is  equally  important  to  know 
the  relationship  between  the  true  and  spurious  solutions. 

Suppose we have determined a motion/surface pair, $\{\mathbf{t}, \boldsymbol{\omega}\}$ and $d = 1/Z$, that is
consistent with the data, where the surface is given by

\[ d = 1 + d_i r_i + \tfrac{1}{2} d_{ij} r_i r_j. \]

Furthermore, suppose we know that the solution is not unique (that is, the solution we
have obtained either fully or, due to noise, approximately satisfies the conditions for an
ambiguous case). Then we can find another (two other) motion/surface pair(s), $\{\tilde{\mathbf{t}}, \tilde{\boldsymbol{\omega}}\}$
and $\tilde{d}$, that is (are) consistent with the data, where the surface $\tilde{d}$ is given by

\[ \tilde{d} = 1 + \tilde{d}_i r_i + \tfrac{1}{2} \tilde{d}_{ij} r_i r_j. \]

Negahdaripour  [1986]  has  derived  the  relationship  among  the  multiple  solutions  in  the 
two  potentially  ambiguous  cases  described  earlier. 


7.1  Case  One:  Three-Fold  Ambiguity  of  Hyperboloids  of  One  Sheet 


There  may  be  three  interpretations  if  the  surface  of  the  scene  is  a  hyperboloid  of  one 
sheet.  The  “surface  of  ambiguity”  is  characterized  by  the  following  conditions: 

(1)  the surface gradient is zero ($d_x = d_y = 0$),

(2)  the Gaussian curvature is negative ($d_{xx} d_{yy} - d_{xy}^2 < 0$), and

(3)  the mean-scaled curvature is unity ($\tfrac{1}{2}(d_{xx} + d_{yy}) = 1$).

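The other solutions, in terms of the first solution, are given by

\[
\begin{aligned}
\tilde{t}_x &= \alpha \tilde{t}_y, & \tilde{\omega}_x &= \omega_x - k d_{yy} / (2 \alpha t_y), \\
\tilde{t}_y &= (t_x d_{yy} + \alpha t_y d_{xx}) / (2\alpha), & \tilde{\omega}_y &= \omega_y - k d_{xx} / (2 t_y), \\
\tilde{t}_z &= 0, & \tilde{\omega}_z &= \omega_z, \\
\tilde{d}_0 &= 1, & \tilde{d}_{xx} &= t_y d_{xx} / \tilde{t}_y, \\
\tilde{d}_x &= 0, & \tilde{d}_{xy} &= -(\alpha t_x d_{xx} + t_y d_{yy}) / (2 \alpha \tilde{t}_y), \\
\tilde{d}_y &= 0, & \tilde{d}_{yy} &= t_x d_{yy} / (\alpha \tilde{t}_y),
\end{aligned}
\]

where

\[ \alpha = \frac{-d_{xy} \pm \sqrt{d_{xy}^2 - d_{xx} d_{yy}}}{d_{xx}} \quad \text{and} \quad k = (\alpha t_y - t_x) t_y. \]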


For  each  solution  of  a,  we  obtain  one  “spurious”  solution  from  these  equations.  We 
can  alternatively  determine  both  spurious  solutions  using  only  one  of  the  solutions  for 
a.  This  requires  that  we  derive  another  set  of  relationships  for  the  dual  solutions.  These 
are  given  by 

\[
\begin{aligned}
\tilde{t}_x &= (\alpha t_x + t_y) d_{yy} / (2\alpha), & \tilde{\omega}_x &= \omega_x - k \alpha / (\alpha t_x + t_y), \\
\tilde{t}_y &= (\alpha t_x + t_y) d_{xx} / 2, & \tilde{\omega}_y &= \omega_y - k / (\alpha t_x + t_y), \\
\tilde{t}_z &= 0, & \tilde{\omega}_z &= \omega_z, \\
\tilde{d}_0 &= 1, & \tilde{d}_{xx} &= 2 t_y / (\alpha t_x + t_y), \\
\tilde{d}_x &= 0, & \tilde{d}_{xy} &= -(t_x + \alpha t_y) / (\alpha t_x + t_y), \\
\tilde{d}_y &= 0, & \tilde{d}_{yy} &= 2 \alpha t_x / (\alpha t_x + t_y),
\end{aligned}
\]

where

\[ k = -(\alpha t_x + t_y)(\alpha t_x d_{xx} - t_y d_{yy}) / (2\alpha). \]

To summarize, we can determine the two "spurious" solutions either (1) by substituting
the two values obtained for $\alpha$ into one of the two sets of relationships given above
or (2) by substituting one of the two solutions of $\alpha$ into both sets of equations. When
the Gaussian curvature is zero (with other conditions being the same), the surface
degenerates to a circular cylinder and the ambiguity reduces to a two-fold one since the two
solutions for $\alpha$ become identical.
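
The two sets of relationships can be tested by checking that the spurious motion/surface pairs reproduce the motion field of the original pair. The sketch below does so for a surface with $d_{xx} = d_{yy} = 1$, $d_{xy} = 10$ and translation $(1, 2, 0)$. It is a check under stated assumptions rather than a definitive transcription: the field model (scene velocity $-\mathbf{t} - \boldsymbol{\omega} \times \mathbf{R}$ with $t_z = 0$) is assumed, $\alpha$ is taken as a root of $d_{xx}\alpha^2 + 2 d_{xy}\alpha + d_{yy} = 0$, the signs in the relations are the ones for which the fields agree, and all function names are hypothetical.

```python
import math


def field(x, y, t, w, s):
    """(u, v) for t = (tx, ty), tz = 0, and d = 1 + dxx x^2/2 + dxy xy + dyy y^2/2."""
    (tx, ty), (wx, wy, wz), (dxx, dxy, dyy) = t, w, s
    d = 1 + 0.5 * dxx * x * x + dxy * x * y + 0.5 * dyy * y * y
    u = -wy + wz * y + wx * x * y - wy * x * x - tx * d
    v = wx - wz * x - wy * x * y + wx * y * y - ty * d
    return u, v


def first_set(t, w, s, a):
    """First set of relations, with k = (a*ty - tx)*ty."""
    (tx, ty), (wx, wy, wz), (dxx, dxy, dyy) = t, w, s
    k = (a * ty - tx) * ty
    ty2 = (tx * dyy + a * ty * dxx) / (2 * a)
    tx2 = a * ty2
    w2 = (wx - k * dyy / (2 * a * ty), wy - k * dxx / (2 * ty), wz)
    s2 = (ty * dxx / ty2,
          -(a * tx * dxx + ty * dyy) / (2 * a * ty2),
          tx * dyy / (a * ty2))
    return (tx2, ty2), w2, s2


def dual_set(t, w, s, a):
    """Dual set of relations, with k = -(a*tx + ty)(a*tx*dxx - ty*dyy)/(2a)."""
    (tx, ty), (wx, wy, wz), (dxx, dxy, dyy) = t, w, s
    c = a * tx + ty
    k = -c * (a * tx * dxx - ty * dyy) / (2 * a)
    t2 = (c * dyy / (2 * a), c * dxx / 2)
    w2 = (wx - k * a / c, wy - k / c, wz)
    s2 = (2 * ty / c, -(tx + a * ty) / c, 2 * a * tx / c)
    return t2, w2, s2


t, w, s = (1.0, 2.0), (0.0, 0.0, 0.0), (1.0, 10.0, 1.0)
a = (-s[1] + math.sqrt(s[1] ** 2 - s[0] * s[2])) / s[0]
true_solution = (t, w, s)
spurious = [first_set(t, w, s, a), dual_set(t, w, s, a)]

points = [(-0.5, -0.5), (0.2, -0.4), (0.0, 0.0), (0.5, 0.3)]
max_gap = max(abs(p - q)
              for sol in spurious
              for (x, y) in points
              for p, q in zip(field(x, y, *sol), field(x, y, *true_solution)))
```

For this input the first set yields a translation of about (0.45, -8.97) and the dual set about (-19.45, 0.97), and `max_gap` is at the level of rounding error, so all three interpretations generate the same second-order motion field.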


7.2  Case  Two:  Two-Fold  Ambiguity  of  Hyperboloids  of  One  Sheet 

Certain  hyperboloids  of  one  sheet  can  give  rise  to  a  two-fold  ambiguity.  The  surface  of 
ambiguity  is  characterized  by  the  following  conditions: 

(1)  the Gaussian curvature is negative ($d_{xx} d_{yy} - d_{xy}^2 < 0$),

(2)  the mean-scaled curvature is unity ($\tfrac{1}{2}(d_{xx} + d_{yy}) = 1$), and

(3)  the surface normal, the optical axis, and one of the asymptotic lines of the quadratic
surface are in the same plane; that is,

\[ \text{either} \quad \frac{d_x}{d_y} = \frac{-d_{xy} + \sqrt{d_{xy}^2 - d_{xx} d_{yy}}}{d_{xx}}
   \quad \text{or} \quad \frac{d_x}{d_y} = \frac{-d_{xy} - \sqrt{d_{xy}^2 - d_{xx} d_{yy}}}{d_{xx}}. \]


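The dual solution is given by

\[
\begin{aligned}
\tilde{t}_x &= \alpha \tilde{t}_y, & \tilde{\omega}_x &= \omega_x - k d_{yy} / (2 \alpha t_y), \\
\tilde{t}_y &= (t_x d_{yy} + \alpha t_y d_{xx}) / (2\alpha), & \tilde{\omega}_y &= \omega_y - k d_{xx} / (2 t_y), \\
\tilde{t}_z &= 0, & \tilde{\omega}_z &= \omega_z + k d_y / t_y, \\
\tilde{d}_0 &= 1, & \tilde{d}_{xx} &= t_y d_{xx} / \tilde{t}_y, \\
\tilde{d}_x &= t_x d_y / \tilde{t}_y, & \tilde{d}_{xy} &= -(\alpha t_x d_{xx} + t_y d_{yy}) / (2 \alpha \tilde{t}_y), \\
\tilde{d}_y &= t_y d_y / \tilde{t}_y, & \tilde{d}_{yy} &= t_x d_{yy} / (\alpha \tilde{t}_y),
\end{aligned}
\]

where $\alpha = d_x/d_y$ and $k = (\alpha t_y - t_x) t_y$. When $d_x/d_y = t_x/t_y$, the ambiguity is resolved
because the two solutions become identical.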

8  Imposing  Depth-Positiveness  Constraint  to  Resolve  Ambiguity 

In  many  cases,  an  incorrect  interpretation,  when  more  than  one  exists,  may  be  ruled  out 
due  to  the  fact  that  it  violates  physical  constraints.  One  such  constraint  is  that  the  depth 
value  for  a  point  that  is  projected  onto  the  image  should  be  positive;  that  is,  a  point  can 
be  seen  only  if  it  is  in  front  of  the  camera  (see  also  Horn  [1986b]). 

Consider, once again, the surface

\[ d = 1 + d_i r_i + \tfrac{1}{2} d_{ij} r_i r_j. \]

Since $d = 1/Z$, then $d = 0$ implies that $Z \to \infty$. Therefore, the conic section

\[ 1 + d_i r_i + \tfrac{1}{2} d_{ij} r_i r_j = 0 \]

is  the  image  of  points  at  infinity.  It  is  also  the  boundary  between  regions  with  positive 
and  negative  depth  values.  The  region(s)  with  positive  depth  values  can  be  the  image  of 
the  surface  under  consideration  whereas  the  regions  with  negative  depth  values  cannot. 
A negative depth value corresponds to a point on a surface that either is behind the
camera or translates in the opposite direction. The point cannot belong to the same
physical  surface  that  is  projected  onto  the  region  of  the  image  with  positive  depth  values. 
The  regions  with  negative  depth  values  are  the  image(s)  of  the  background  and/or  other 
objects  in  the  scene.  Therefore,  the  conic  section  is  the  boundary  between  the  object 
under  consideration  and  the  background  or  other  objects  in  the  scene.  We  can  use  these 
boundaries  to  identify  the  correct  solution,  provided  that  the  field  of  view  is  large  enough 
so  that  the  image  includes  some  portion  of  these  boundaries. 

It  is  more  instructive  to  study  an  ambiguous  situation.  We  consider  the  case  that  gives 
rise  to  a  three-fold  ambiguity  since  it  is  easier  to  derive  the  equations  for  the  minimum 
size  of  the  field  of  view.  In  this  case,  the  surface  is  given  by 

\[ d = 1 + \tfrac{1}{2} d_{xx} x^2 + d_{xy} xy + \tfrac{1}{2} d_{yy} y^2 \]

with the constraints

\[ d_{xx} d_{yy} - d_{xy}^2 < 0 \quad \text{and} \quad \tfrac{1}{2}(d_{xx} + d_{yy}) = 1. \]

Suppose the image plane is circular with radius $r$ (the half-angle field of view is
$\tan^{-1}(r)$). In order to have some portion of the boundaries of the true surface within the
field of view, we should have $r > r_c$, where $r_c$ is the shortest distance from the origin of
the image plane to a point on the conic section

\[ 1 + \tfrac{1}{2} d_{xx} x^2 + d_{xy} xy + \tfrac{1}{2} d_{yy} y^2 = 0. \]

Ideally,  we  want  a  larger  field  of  view  so  that  we  can  have  as  much  of  the  boundary  in 
the  image  as  possible. 

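When $d_{xy} \ne 0$, it can be shown that $r_c = \sqrt{x_c^2 + y_c^2}$, where

\[ x_c = \pm \sqrt{\frac{-2}{d_{yy} m^2 + 2 d_{xy} m + d_{xx}}}, \qquad y_c = m x_c, \]

and

\[ m = \frac{d_{yy} - d_{xx}}{2 d_{xy}} \pm \sqrt{\left( \frac{d_{yy} - d_{xx}}{2 d_{xy}} \right)^2 + 1}. \]

Only one of the two signs gives a value of $m$ that makes the solution for $x_c$ and $y_c$
real-valued (the proof follows easily from the geometry of the problem). When $d_{xy} = 0$, then
either $d_{xx} < 0$ or $d_{yy} < 0$ (because $d_{xx} d_{yy} - d_{xy}^2 = d_{xx} d_{yy} < 0$). In this case, we instead
have either

\[ x_c = \pm \sqrt{-2 / d_{xx}}, \; y_c = 0 \quad (d_{xx} < 0) \qquad \text{or} \qquad x_c = 0, \; y_c = \pm \sqrt{-2 / d_{yy}} \quad (d_{yy} < 0). \]
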
We can derive similar expressions for $\tilde{r}_c$, the shortest distance from the origin of the image plane to a point on the conic section of either spurious solution. If $\tilde{r}_c > r > r_c$, the image includes some portion of the boundaries of the true surface but no part of the boundaries of the spurious solution. If $r > r_c$ and $r > \tilde{r}_c$, the image should include some portion of the boundaries of both the true and spurious solutions. It is only when $r < \min(r_c, \tilde{r}_c)$ that we cannot identify the true solution, since then the depth values are positive everywhere in the image for every possible solution.
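
The field-of-view test above can be sketched in a few lines of Python. This is a minimal illustration, assuming the boundary is the conic $1 + \tfrac{1}{2} d_{xx} x^2 + d_{xy} xy + \tfrac{1}{2} d_{yy} y^2 = 0$; the function names and the sample coefficients are hypothetical choices made for the sketch, not values from the text.

```python
import math

def shortest_distance_to_conic(dxx, dxy, dyy):
    """Shortest distance from the image origin to the conic
    1 + 0.5*dxx*x^2 + dxy*x*y + 0.5*dyy*y^2 = 0."""
    candidates = []
    if dxy != 0:
        base = (dyy - dxx) / (2.0 * dxy)
        root = math.sqrt(base * base + 1.0)
        for m in (base + root, base - root):
            denom = dyy * m * m + 2.0 * dxy * m + dxx
            if denom < 0:  # only then is x_c real-valued
                xc = math.sqrt(-2.0 / denom)
                candidates.append(math.hypot(xc, m * xc))
    else:
        # dxy == 0: the closest point lies on a coordinate axis
        candidates = [math.sqrt(-2.0 / c) for c in (dxx, dyy) if c < 0]
    if not candidates:
        raise ValueError("the conic has no real points")
    return min(candidates)

def boundaries_in_view(r, r_true, r_spurious):
    """Which boundaries fall inside a circular image of radius r."""
    return {
        "true": r > r_true,
        "spurious": r > r_spurious,
        # ambiguous only when r is below both distances
        "resolvable": r > min(r_true, r_spurious),
    }

# Illustrative (assumed) coefficients with dxy = 0 and unit mean-scaled curvature.
r_true = shortest_distance_to_conic(-2.0, 0.0, 4.0)  # sqrt(-2 / -2) = 1.0
status = boundaries_in_view(r=0.8, r_true=r_true, r_spurious=1.5)
```

Here `status["resolvable"]` is false because the assumed image radius 0.8 is smaller than both distances, which is the only case in which the positive-depth constraint cannot single out the true solution.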

Example: Consider a viewer moving with translational velocity $t = [1, 2, 0]^T$ with respect to surface $d$ defined by

$$d = 1 + 0.5x^2 + 10xy + 0.5y^2.$$

Using the equations given earlier, the spurious solutions are:

(1) An observer moving with translational and rotational velocities $t_1 = [0.445, -8.98, 0]^T$ and $\omega_1 = [-11.0, 0.550, 0]^T$ with respect to surface $d_1$ given by

$$d_1 = 1 - 0.112x^2 - 2.17xy + 1.11y^2.$$

(2) An observer moving with translational and rotational velocities $t_2 = [-19.5, 0.975, 0]^T$ and $\omega_2 = [-1.03, 20.5, 0]^T$ with respect to surface $d_2$ given by

$$d_2 = 1 + 1.03x^2 - 0.461xy - 0.026y^2.$$

The  boundaries  of  the  three  surfaces  are  shown  in  Figure  2.  For  each  solution,  the  regions 
of  negative  depth  values  are  shown  by  hatched  lines.  The  resulting  second-order  motion 
field  is  given  by 

$$\begin{pmatrix} -1 - 0.5x^2 - 10xy - 0.5y^2 \\ -2 - x^2 - 20xy - y^2 \end{pmatrix},$$

which is shown in Figure 3 (the image plane is a unit square; that is, the field of view is $2\tan^{-1}(0.5) \approx 53$ degrees). Note that the velocity vectors are all parallel, emanating from the focus of expansion at infinity, in the region where the depth values are positive (the region including the bottom-left, the center, and the top-right of the image). They point in the opposite direction in the bottom-right and top-left of the image. This implies that, for points in these regions, either the motion is in the opposite direction (to the true one) or the depth values are negative. In either case, these regions cannot be the images of parts of the same object that is imaged into the region with positive depth values.
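
The sign change across the image can be checked numerically. The sketch below assumes, reading off the printed field, that the motion field of the example is $-d(x, y)\,[t_x, t_y]^T$ with $t = [1, 2, 0]^T$; the function names are illustrative.

```python
def surface_d(x, y):
    """The example surface d = 1 + 0.5 x^2 + 10 x y + 0.5 y^2."""
    return 1.0 + 0.5 * x * x + 10.0 * x * y + 0.5 * y * y

def motion_field(x, y, t=(1.0, 2.0)):
    """Image velocity at (x, y), assumed to be -d(x, y) * (t_x, t_y)."""
    d = surface_d(x, y)
    return (-d * t[0], -d * t[1])

# Sample the center and two corners of the unit-square image.
center = motion_field(0.0, 0.0)
top_right = motion_field(0.5, 0.5)
bottom_right = motion_field(0.5, -0.5)
```

At the center and the top-right corner the vectors are negative multiples of $(1, 2)$, while at the bottom-right corner `surface_d` is negative and the vector flips to a positive multiple of $(1, 2)$.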

The apices of the conic section of the true surface are located at $\pm(1/3, -1/3)$. If the field of view is larger than $2\tan^{-1}(\sqrt{2}/3) \approx 50.5$ degrees, we can identify the correct solution by matching the boundaries of depth discontinuity in the image with the conic section of the true surface.
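
As a quick numerical check of the apex and the two field-of-view angles in this example, the following sketch evaluates them directly. It assumes the apex $(1/3, -1/3)$ and the unit-square image stated in the text; the variable names are illustrative.

```python
import math

def surface_d(x, y):
    """The example surface d = 1 + 0.5 x^2 + 10 x y + 0.5 y^2."""
    return 1.0 + 0.5 * x * x + 10.0 * x * y + 0.5 * y * y

apex = (1.0 / 3.0, -1.0 / 3.0)

# The apex should lie on the boundary d = 0 of the true surface.
residual_at_apex = surface_d(*apex)

# Distance of the apex from the image origin, and the full field of view
# needed before the boundary enters a circular image.
r_c = math.hypot(*apex)
fov_threshold_deg = math.degrees(2.0 * math.atan(r_c))

# Full field of view across a unit-square image (half-width 0.5).
fov_unit_square_deg = math.degrees(2.0 * math.atan(0.5))
```

The threshold comes out slightly below the field of view of the unit-square image, so part of the true boundary is visible in Figure 3.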

The  two  spurious  solutions  involve  a  viewer  rotating  about  an  axis  parallel  to  one  of 
the  asymptotic  directions  and  translating  parallel  to  the  other  asymptotic  direction  of 
the  true  surface.  This  is  quite  counter-intuitive  since  the  motion  field  suggests  that  the 
underlying  3D  motion  is  purely  translational. 
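
The alignment between the spurious motions and the asymptotic directions can be checked from the numbers in the example. The sketch below assumes the asymptotic directions are the image-plane directions along which the quadratic part $\tfrac{1}{2} d_{xx} x^2 + d_{xy} xy + \tfrac{1}{2} d_{yy} y^2$ vanishes, and it uses only the $x$ and $y$ components of the velocities (the $z$ components are zero). Function names are illustrative.

```python
import math

def asymptotic_directions(dxx, dxy, dyy):
    """Directions (x, y) with 0.5*dxx*x^2 + dxy*x*y + 0.5*dyy*y^2 = 0."""
    disc = dxy * dxy - dxx * dyy
    if disc <= 0:
        raise ValueError("no real asymptotic directions")
    root = math.sqrt(disc)
    if dyy != 0:
        # Solve dyy*m^2 + 2*dxy*m + dxx = 0 for the slope m = y/x.
        return [(1.0, (-dxy + s * root) / dyy) for s in (1.0, -1.0)]
    # dyy == 0: x * (dxx*x + 2*dxy*y) = 0
    return [(0.0, 1.0), (2.0 * dxy, -dxx)]

def misalignment(u, v):
    """|sin| of the angle between 2D vectors u and v (0 when parallel)."""
    cross = u[0] * v[1] - u[1] * v[0]
    return abs(cross) / (math.hypot(*u) * math.hypot(*v))

# True surface of the example: d = 1 + 0.5 x^2 + 10 x y + 0.5 y^2.
a_shallow, a_steep = asymptotic_directions(1.0, 10.0, 1.0)

# Spurious solutions as quoted in the example (x and y components only).
t1, w1 = (0.445, -8.98), (-11.0, 0.550)
t2, w2 = (-19.5, 0.975), (-1.03, 20.5)

checks = {
    "t1 vs steep": misalignment(t1, a_steep),
    "w1 vs shallow": misalignment(w1, a_shallow),
    "t2 vs shallow": misalignment(t2, a_shallow),
    "w2 vs steep": misalignment(w2, a_steep),
}
```

All four values in `checks` are small, at the level expected from the three-digit rounding of the quoted velocities.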

9  Summary 

In  this  paper  I  have  presented  some  results  concerning  the  ambiguity  in  the  interpretation 
of  the  motion  of  curved  surfaces.  These  results  suggest  that  only  certain  hyperboloids  of 
one sheet or circular cylinders viewed by an observer moving parallel to the image can
give rise to an ambiguity in the interpretation of the underlying motion. In the case of
hyperboloids of one sheet, the ambiguity can be either two-fold or three-fold, whereas
there can be at most two solutions in the case of circular cylinders. In either case, the
resulting  motion  field  is  second-order.  I  have  also  given  analytical  expressions  for  the 
relationship  among  multiple  solutions.  In  most  cases,  an  ambiguity  can  be  resolved  by 
imposing  the  positive-depth  constraint. 


Figure 2. The images of the boundaries of three hyperboloids of one sheet responsible for a threefold ambiguity. The ambiguity in a motion field is usually restricted to a small region of the image. For a sufficiently large field of view, the ambiguity can be resolved by matching the depth discontinuity boundaries with the conic section of the true surface.

Figure 3. A motion field with three rigid body motion interpretations.

Appendix


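The eigenvalues of

$$D = \begin{pmatrix} d_{xx} & d_{xy} & d_x \\ d_{xy} & d_{yy} & d_y \\ d_x & d_y & 2 \end{pmatrix}$$

are the solutions of the characteristic equation

$$\lambda^3 + a_2 \lambda^2 + a_1 \lambda + a_0 = 0,$$

where

$$a_2 = -(2 + d_{xx} + d_{yy}),$$

$$a_1 = 2(d_{xx} + d_{yy}) + (d_{xx} d_{yy} - d_{xy}^2) - (d_x^2 + d_y^2),$$

$$a_0 = -2(d_{xx} d_{yy} - d_{xy}^2) + (d_{xx} d_y^2 - 2 d_{xy} d_x d_y + d_{yy} d_x^2).$$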

One of the conditions in the case of an ambiguity is that the mean-scaled curvature is unity; that is, $\tfrac{1}{2}(d_{xx} + d_{yy}) = 1$. Therefore, we have

$$a_2 = -4,$$

$$a_1 = 4 + (d_{xx} d_{yy} - d_{xy}^2) - (d_x^2 + d_y^2),$$

$$a_0 = -2(d_{xx} d_{yy} - d_{xy}^2) + (d_{xx} d_y^2 - 2 d_{xy} d_x d_y + d_{yy} d_x^2).$$

Another condition is that either the surface gradient should vanish ($d_x = d_y = 0$) or the optical axis, the surface normal and one of the asymptotic lines of the surface are in the same plane. In either case, we obtain

$$d_{xx} d_y^2 - 2 d_{xy} d_x d_y + d_{yy} d_x^2 = 0.$$

Using  this  in  the  earlier  equations  and  simplifying  the  results,  we  can  show  that  the 
characteristic  equation  simplifies  to 

$$(\lambda - 2)(\lambda^2 - 2\lambda - (p + q)) = 0,$$

where 

$$p = d_x^2 + d_y^2 \quad\text{and}\quad q = d_{xy}^2 - d_{xx} d_{yy}.$$

Therefore,  the  eigenvalues  of  D,  in  the  ascending  order,  are 

$$\lambda_- = 1 - \sqrt{1 + p + q}, \quad \lambda_0 = 2, \quad\text{and}\quad \lambda_+ = 1 + \sqrt{1 + p + q}.$$
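
The closed-form eigenvalues can be checked against the determinant for the surface of the earlier example. The sketch below is restricted to the vanishing-gradient case $d_x = d_y = 0$ (so $p = 0$), with $d_{xx} = d_{yy} = 1$ and $d_{xy} = 10$; the helper names are illustrative.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def char_poly(dxx, dxy, dyy, lam):
    """det(D - lam*I) for D = [[dxx, dxy, 0], [dxy, dyy, 0], [0, 0, 2]]."""
    return det3([
        [dxx - lam, dxy, 0.0],
        [dxy, dyy - lam, 0.0],
        [0.0, 0.0, 2.0 - lam],
    ])

def closed_form_eigenvalues(dxx, dxy, dyy):
    """Eigenvalues 1 - sqrt(1 + q), 2, 1 + sqrt(1 + q) with p = 0.
    Assumes unit mean-scaled curvature, 0.5 * (dxx + dyy) == 1."""
    q = dxy * dxy - dxx * dyy
    s = math.sqrt(1.0 + q)
    return (1.0 - s, 2.0, 1.0 + s)

eigs = closed_form_eigenvalues(1.0, 10.0, 1.0)
residuals = [char_poly(1.0, 10.0, 1.0, lam) for lam in eigs]
```

For this surface $q = 99$, so the three values are $-9$, $2$ and $11$, and the determinant vanishes at each of them.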
